DWE: Discriminating Word Enumerator

Authors

  • Pavel Sumazin
  • Gengxin Chen
  • Naoya Hata
  • Andrew D. Smith
  • Theresa Zhang
  • Michael Q. Zhang
Abstract

MOTIVATION: Tissue-specific transcription factor binding sites give insight into tissue-specific transcription regulation.

RESULTS: We describe a word-counting-based tool for de novo discovery of tissue-specific transcription factor binding sites that uses expression information in addition to sequence information. We incorporate tissue-specific gene expression by classifying genes as positively expressed or repressed. We present a direct statistical approach for finding transcription factor binding sites that are overrepresented in a foreground promoter sequence set relative to a background promoter sequence set. Our approach extends naturally to the search for synergistic transcription factor binding sites. We find putative transcription factor binding sites that are overrepresented in the proximal promoters of liver-specific genes relative to the proximal promoters of liver-independent genes. Our results indicate that binding sites for hepatocyte nuclear factors (especially HNF-1 and HNF-4) and CCAAT/enhancer-binding protein (C/EBPbeta) are the most overrepresented in proximal promoters of liver-specific genes. Our results suggest that HNF-4 has strong synergistic relationships with HNF-1, HNF-4 and HNF-3beta and with C/EBPbeta.

AVAILABILITY: Programs are available for use over the Web at http://rulai.cshl.edu/tools/dwe.
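
The idea of scoring words by their overrepresentation in a foreground set against a background set can be illustrated with a minimal sketch. This is not the DWE implementation: the abstract does not specify the statistic, so the sketch assumes a hypergeometric tail test on the number of sequences containing each word, ignores reverse complements, and uses hypothetical function names (`sequences_with_word`, `hypergeom_tail`, `discriminating_words`) and a hypothetical default word length of 6.

```python
from math import comb


def sequences_with_word(seqs, k):
    """Map each k-mer to the number of sequences that contain it at least once."""
    counts = {}
    for seq in seqs:
        seq = seq.upper()
        # A set, so that repeated occurrences within one sequence count once.
        words = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for word in words:
            counts[word] = counts.get(word, 0) + 1
    return counts


def hypergeom_tail(hits_fg, n_fg, hits_total, n_total):
    """P(X >= hits_fg) when n_fg sequences are drawn without replacement
    from n_total sequences, hits_total of which contain the word."""
    upper = min(n_fg, hits_total)
    tail = sum(
        comb(hits_total, x) * comb(n_total - hits_total, n_fg - x)
        for x in range(hits_fg, upper + 1)
    )
    return tail / comb(n_total, n_fg)


def discriminating_words(foreground, background, k=6):
    """Return (p_value, word) pairs for every foreground k-mer, smallest p first.

    Assumption: the test statistic is a hypergeometric tail; DWE itself may
    use a different one.
    """
    fg = sequences_with_word(foreground, k)
    bg = sequences_with_word(background, k)
    n_fg = len(foreground)
    n_total = n_fg + len(background)
    scored = []
    for word, hits_fg in fg.items():
        hits_total = hits_fg + bg.get(word, 0)
        scored.append((hypergeom_tail(hits_fg, n_fg, hits_total, n_total), word))
    return sorted(scored)
```

Calling `discriminating_words(foreground, background)` on two lists of promoter sequences returns the candidate words ranked by p-value. A real analysis would also need a multiple-testing correction, since many words are tested at once.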

Journal:
  • Bioinformatics

Volume 21, Issue 1

Pages: -

Publication date: 2005